Surabaya
AquaFusionNet: Lightweight Vision-Sensor Fusion Framework for Real-Time Pathogen Detection and Water Quality Anomaly Prediction on Edge Devices
Kristanto, Sepyan Purnama, Hakim, Lutfi, Hermansyah
Abstract--Evidence from many low- and middle-income regions shows that microbial contamination in small-scale drinking-water systems often fluctuates rapidly, yet existing monitoring tools capture only fragments of this behaviour. Microscopic imaging provides organism-level visibility, whereas physicochemical sensors reveal short-term changes in water chemistry; in practice, operators must interpret these streams separately, making real-time decision-making unreliable. This study introduces AquaFusionNet, a lightweight cross-modal framework that unifies both information sources inside a single edge-deployable model. Unlike prior work that treats microscopic detection and water-quality prediction as independent tasks, AquaFusionNet learns the statistical dependencies between microbial appearance and concurrent sensor dynamics through a gated cross-attention mechanism designed specifically for low-power hardware. The framework is trained on AquaMicro12K, a new dataset comprising 12,846 annotated 1000× micrographs curated for drinking-water contexts, an area where publicly accessible microscopic datasets are scarce. Deployed for six months across seven facilities in East Java, Indonesia, the system processed 1.84 million frames and consistently detected contamination events with 94.8% mAP@0.5 and 96.3% anomaly-prediction accuracy, while operating at 4.8 W on a Jetson Nano. Comparative experiments against representative lightweight detectors show that AquaFusionNet provides higher accuracy at comparable or lower power, and field results indicate that cross-modal coupling reduces common failure modes of unimodal detectors, particularly under fouling, turbidity spikes, and inconsistent illumination. All models, data, and hardware designs are released openly to facilitate replication and adaptation in decentralized water-safety infrastructures.
Safe drinking water is a prerequisite for public health, yet it remains out of reach for a substantial fraction of the global population. Recent estimates from the WHO/UNICEF Joint Monitoring Programme indicate that 2.2 billion people still lack safely managed drinking-water services and that unsafe water, sanitation, and hygiene (WASH) contribute to approximately 1.4 million deaths per year [1], [2].
- Asia > India (0.05)
- North America > United States > Michigan (0.04)
- Europe > Spain > Galicia > Madrid (0.04)
- (41 more...)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
- Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.62)
Beyond One-Size-Fits-All: Personalized Harmful Content Detection with In-Context Learning
Zhang, Rufan, Zhang, Lin, Mi, Xianghang
The proliferation of harmful online content--e.g., toxicity, spam, and negative sentiment--demands robust and adaptable moderation systems. However, prevailing moderation systems are centralized and task-specific, offering limited transparency and neglecting diverse user preferences--an approach ill-suited for privacy-sensitive or decentralized environments. We propose a novel framework that leverages in-context learning (ICL) with foundation models to unify the detection of toxicity, spam, and negative sentiment across binary, multi-class, and multi-label settings. Crucially, our approach enables lightweight personalization, allowing users to easily block new categories, unblock existing ones, or extend detection to semantic variations through simple prompt-based interventions--all without model retraining. Extensive experiments on public benchmarks (TextDetox, UCI SMS, SST2) and a new, annotated Mastodon dataset reveal that: (i) foundation models achieve strong cross-task generalization, often matching or surpassing task-specific fine-tuned models; (ii) effective personalization is achievable with as few as one user-provided example or definition; and (iii) augmenting prompts with label definitions or rationales significantly enhances robustness to noisy, real-world data. Our work demonstrates a definitive shift beyond one-size-fits-all moderation, establishing ICL as a practical, privacy-preserving, and highly adaptable pathway for the next generation of user-centric content safety systems. To foster reproducibility and facilitate future research, we publicly release our code on GitHub and the annotated Mastodon dataset on Hugging Face.
- North America > United States > Washington > King County > Seattle (0.14)
- Asia > Middle East > Israel (0.14)
- Asia > China (0.04)
- (10 more...)
- Research Report > New Finding (1.00)
- Overview (0.92)
Seq-DeepIPC: Sequential Sensing for End-to-End Control in Legged Robot Navigation
The model jointly predicts semantic segmentation and depth estimation, giving richer spatial features for planning and control. For efficient deployment on edge devices, we use EfficientNet-B0 as the encoder, reducing computation while maintaining accuracy. Heading estimation is simplified by removing the noisy IMU and instead computing the bearing angle directly from consecutive GNSS positions. We collected a larger and more diverse dataset that includes both road and grass terrains, and validated Seq-DeepIPC on a robot dog. Comparative and ablation studies show that sequential inputs improve perception and control in our models, while other baselines do not benefit. Seq-DeepIPC achieves competitive or better results with reasonable model size; although GNSS-only heading is less reliable near tall buildings, it is robust in open areas. Overall, Seq-DeepIPC extends end-to-end navigation beyond wheeled robots to more versatile and temporally-aware systems.
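The GNSS-only heading estimate mentioned in the Seq-DeepIPC abstract can be illustrated with the standard initial-bearing formula between two latitude/longitude fixes. This is a minimal sketch, not the authors' implementation: the function names `bearing_deg` and `heading_from_fixes` are hypothetical, and the abstract does not say which bearing formula, smoothing, or minimum-displacement threshold the paper uses.

```python
import math


def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from fix 1 to fix 2.

    Inputs are in decimal degrees; the result is in degrees,
    measured clockwise from true north, in the range [0, 360).
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    # East and north components of the direction of travel.
    east = math.sin(dlon) * math.cos(phi2)
    north = (math.cos(phi1) * math.sin(phi2)
             - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.degrees(math.atan2(east, north)) % 360.0


def heading_from_fixes(fixes):
    """Heading from the two most recent (lat, lon) fixes.

    Returns None when no heading can be derived (fewer than two fixes,
    or the robot has not moved). This guard is an assumption of the sketch.
    """
    if len(fixes) < 2 or fixes[-1] == fixes[-2]:
        return None
    (lat1, lon1), (lat2, lon2) = fixes[-2], fixes[-1]
    return bearing_deg(lat1, lon1, lat2, lon2)
```

Because the heading is derived purely from position differences, any GNSS position error feeds directly into the bearing, which is consistent with the abstract's remark that the estimate is less reliable near tall buildings.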
- North America > United States (0.04)
- Asia > Japan > Honshū > Kansai > Osaka Prefecture > Osaka (0.04)
- Asia > Indonesia > Java > East Java > Surabaya (0.04)
Automatic essay scoring: leveraging Jaccard coefficient and Cosine similarity with n-gram variation in vector space model approach
Cahyani, Andharini Dwi, Fathoni, Moh. Wildan, Rachman, Fika Hastarita, Basuki, Ari, Amin, Salman, Khotimah, Bain Khusnul
Automated essay scoring (AES) is a vital area of research aiming to provide efficient and accurate assessment tools for evaluating written content. This study investigates the effectiveness of two popular similarity metrics, the Jaccard coefficient and Cosine similarity, within the context of vector space models (VSM) employing unigram, bigram, and trigram representations. The data used in this research were obtained from formative essays in the citizenship education subject at a junior high school. Each essay undergoes preprocessing to extract features using n-gram models, followed by vectorization to transform text data into numerical representations. Then, similarity scores are computed between essays using both the Jaccard coefficient and Cosine similarity. The performance of the system is evaluated by analyzing the root mean square error (RMSE), which measures the difference between the scores given by human graders and those generated by the system. The results show that Cosine similarity outperformed the Jaccard coefficient. In terms of n-gram size, unigrams yield lower RMSE than bigrams and trigrams.
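The pipeline in the essay-scoring abstract (n-gram extraction, vectorization, similarity, RMSE) can be sketched in a few lines. This is an illustrative sketch under stated assumptions: whitespace tokenization, raw n-gram counts as the vector representation, and the helper names `ngrams`, `jaccard`, `cosine`, and `rmse` are all hypothetical choices; the abstract does not specify the paper's preprocessing, term weighting, or how a similarity value is mapped to a grade.

```python
import math
from collections import Counter


def ngrams(text, n):
    """Lowercase, split on whitespace, and return the list of n-gram tuples."""
    tokens = text.lower().split()
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def jaccard(grams_a, grams_b):
    """Jaccard coefficient: |A intersect B| / |A union B| over n-gram sets."""
    set_a, set_b = set(grams_a), set(grams_b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def cosine(grams_a, grams_b):
    """Cosine similarity between raw n-gram count vectors."""
    count_a, count_b = Counter(grams_a), Counter(grams_b)
    dot = sum(count_a[g] * count_b[g] for g in count_a.keys() & count_b.keys())
    norm_a = math.sqrt(sum(v * v for v in count_a.values()))
    norm_b = math.sqrt(sum(v * v for v in count_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rmse(predicted, actual):
    """Root mean square error between system scores and human scores."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))
```

Calling `ngrams(text, 1)`, `ngrams(text, 2)`, or `ngrams(text, 3)` switches between the unigram, bigram, and trigram representations compared in the study. Jaccard ignores how often an n-gram occurs, while cosine over counts does not, which is one plausible reason the two metrics can rank essays differently.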
- North America > United States > Texas (0.14)
- North America > United States > Florida > Palm Beach County > Boca Raton (0.04)
- Europe > United Kingdom (0.04)
- (5 more...)
- Education > Assessment & Standards > Student Performance (1.00)
- Education > Educational Setting > K-12 Education > Secondary School (0.48)
- Education > Educational Technology > Educational Software > Computer Based Training (0.34)
- Education > Educational Technology > Educational Software > Computer-Aided Assessment (0.34)
VeMo: A Lightweight Data-Driven Approach to Model Vehicle Dynamics
Oddo, Girolamo, Nuca, Roberto, Parsani, Matteo
Abstract--Developing a dynamic model for a high-performance vehicle is a complex problem that requires extensive structural information about the system under analysis. This information is often unavailable to those who did not design the vehicle and represents a typical issue in autonomous driving applications, which are frequently developed on top of existing vehicles; therefore, vehicle models are developed under conditions of information scarcity. This paper proposes a lightweight encoder-decoder model based on Gated Recurrent Unit layers to correlate the vehicle's future state with its past states, measured onboard, and the control actions the driver performs. The results demonstrate that the model achieves a maximum mean relative error below 2.6% in extreme dynamic conditions. It also shows good robustness when subject to noisy input data across the frequency components of interest. Furthermore, despite being entirely data-driven and free from physical constraints, the model exhibits physical consistency in the output signals, such as longitudinal and lateral accelerations, yaw rate, and the vehicle's longitudinal velocity.
In the automotive sector, developing a representative vehicle dynamics model is a complex and multifaceted challenge [1]-[3]. Numerous nonlinear factors influence vehicle dynamics, including tire characteristics, suspension geometry, aerodynamics, drivetrain effects, and external environmental factors, such as road surface grip conditions and climatic effects (e.g., wind). Accurately capturing these effects in a computational model requires high-fidelity multibody simulation software and a profound understanding of the vehicle system.
- Automobiles & Trucks (1.00)
- Transportation > Ground > Road (0.48)
- Leisure & Entertainment > Sports > Motorsports (0.46)
- Information Technology > Robotics & Automation (0.34)
- Information Technology > Data Science (1.00)
- Information Technology > Artificial Intelligence > Robots (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
- Asia > India (0.05)
- Asia > Philippines (0.04)
- North America > United States > Michigan (0.04)
- (45 more...)
- Information Technology > Communications (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Detection and Measurement of Hailstones with Multimodal Large Language Models
Alker, Moritz, Schedl, David C., Stöckl, Andreas
This study examines the use of social media and news images to detect and measure hailstones, utilizing pre-trained multimodal large language models. The dataset for this study comprises 474 crowdsourced images of hailstones from documented hail events in Austria, which occurred between January 2022 and September 2024. These hailstones have maximum diameters ranging from 2 to 11 cm. We estimate the hail diameters and compare four different models utilizing one-stage and two-stage prompting strategies. The latter utilizes additional size cues from reference objects, such as human hands, within the image. Our results show that pre-trained models already have the potential to measure hailstone diameters from images, with a mean absolute error of 1.12 cm for the best model. Compared with a one-stage prompt, two-stage prompting improves the reliability of most models. Our study suggests that these off-the-shelf models, even without fine-tuning, can complement traditional hail sensors by extracting meaningful and spatially dense information from social media imagery, enabling faster and more detailed assessments of severe weather events. Automated real-time image harvesting from social media and other sources remains an open task, but it would make our approach directly applicable to future hail events.
- North America (0.04)
- Europe > Switzerland (0.04)
- Asia > Indonesia > Java > East Java > Surabaya (0.04)
- (4 more...)
MAD: Manifold Attracted Diffusion
Elbrächter, Dennis, Alberti, Giovanni S., Santacesaria, Matteo
Score-based diffusion models are a highly effective method for generating samples from a distribution of images. We consider scenarios where the training data comes from a noisy version of the target distribution, and present an efficiently implementable modification of the inference procedure to generate noiseless samples. Our approach is motivated by the manifold hypothesis, according to which meaningful data is concentrated around some low-dimensional manifold of a high-dimensional ambient space. The central idea is that noise manifests as low-magnitude variation in off-manifold directions, in contrast to the relevant variation of the desired distribution, which is mostly confined to on-manifold directions. We introduce the notion of an extended score and show that, in a simplified setting, it can be used to reduce small variations to zero, while leaving large variations mostly unchanged. We describe how its approximation can be computed efficiently from an approximation to the standard score and demonstrate its efficacy on toy problems, synthetic data, and real data.
OVGrasp: Open-Vocabulary Grasping Assistance via Multimodal Intent Detection
Hu, Chen, Luo, Shan, Gionfrida, Letizia
Grasping assistance is essential for restoring autonomy in individuals with motor impairments, particularly in unstructured environments where object categories and user intentions are diverse and unpredictable. We present OVGrasp, a hierarchical control framework for soft exoskeleton-based grasp assistance that integrates RGB-D vision, open-vocabulary prompts, and voice commands to enable robust multimodal interaction. To enhance generalization in open environments, OVGrasp incorporates a vision-language foundation model with an open-vocabulary mechanism, allowing zero-shot detection of previously unseen objects without retraining. A multimodal decision-maker further fuses spatial and linguistic cues to infer user intent, such as grasp or release, in multi-object scenarios. We deploy the complete framework on a custom egocentric-view wearable exoskeleton and conduct systematic evaluations on 15 objects across three grasp types. Experimental results with ten participants demonstrate that OVGrasp achieves a grasping ability score (GAS) of 87.00%, outperforming state-of-the-art baselines and achieving improved kinematic alignment with natural hand motion.
- Europe > Switzerland > Zürich > Zürich (0.14)
- South America > Uruguay > Maldonado > Maldonado (0.04)
- North America > United States > California (0.04)
- (4 more...)
- Information Technology (0.68)
- Health & Medicine > Therapeutic Area > Neurology (0.46)